Search CORE

11 research outputs found

Opportunistic linked data querying through approximate membership metadata

Author: BH Bloom
C Buil-Aranda
E Oren
G Aluç
I Ermilov
I Filali
M Schmachtenberg
R Gallager
R Verborgh
X Zhang
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2015

Between URI dereferencing and the SPARQL protocol lies a largely unexplored axis of possible interfaces to Linked Data, each with its own combination of trade-offs. One of these interfaces is Triple Pattern Fragments, which allows clients to execute SPARQL queries against low-cost servers, at the cost of higher bandwidth. Increasing a client's efficiency means lowering the number of requests, which can be achieved, among other ways, through additional metadata in responses. We noted that typical SPARQL query evaluations against Triple Pattern Fragments require a significant portion of membership subqueries, which check the presence of a specific triple, rather than a variable pattern. This paper studies the impact of providing approximate membership functions, i.e., Bloom filters and Golomb-coded sets, as extra metadata. In addition to reducing HTTP requests, such functions make it possible to achieve full result recall earlier when temporarily allowing lower precision. Half of the tested queries from a WatDiv benchmark test set could be executed with up to a third fewer HTTP requests with only marginally higher server cost. Query times, however, did not improve, likely due to slower metadata generation and transfer. This indicates that approximate membership functions can partly improve the client-side query process with minimal impact on the server and its interface.

Crossref

Ghent University Academic Bibliography

Diversified Stress Testing of RDF Data Management Systems

Author: C. Bizer
D.J. Abadi
G. Aluç
L. Sidirourgos
L. Zou
M. Morsey
O. Erling
S. Auer
S. Idreos
T. Neumann
Y. Guo
Publication date: 01/01/2014

The Resource Description Framework (RDF) is a standard for conceptually describing data on the Web, and SPARQL is the query language for RDF. As RDF data continue to be published across heterogeneous domains and integrated at Web-scale such as in the Linked Open Data (LOD) cloud, RDF data management systems are being exposed to queries that are far more diverse and workloads that are far more varied. The first contribution of our work is an in-depth experimental analysis that shows existing SPARQL benchmarks are not suitable for testing systems for diverse queries and varied workloads. To address these shortcomings, our second contribution is the Waterloo SPARQL Diversity Test Suite (WatDiv) that provides stress testing tools for RDF data management systems. Using WatDiv, we have been able to reveal issues with existing systems that went unnoticed in evaluations using earlier benchmarks. Specifically, our experiments with five popular RDF data management systems show that they cannot deliver good performance uniformly across workloads. For some queries, there can be as much as five orders of magnitude difference between the query execution time of the fastest and the slowest system, while the fastest system on one query may unexpectedly time out on another query. By performing a detailed analysis, we pinpoint these problems to specific types of queries and workloads.

CiteSeerX

Crossref

Parallelizing Federated SPARQL Queries in Presence of Replicated Data

Author: A Schwarte
AN Wilschut
C Buil-Aranda
D Bitton
D Kossmann
F Wilcoxon
G Aluç
G Montoya
JD Fernández
M Acosta
M Saleem
MT Özsu
O Görlitz
R Verborgh
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2017

Federated query engines have been enhanced to exploit new data localities created by replicated data, e.g., Fedra. However, existing replication-aware federated query engines mainly focus on pruning sources during the source selection and query decomposition in order to reduce intermediate results thanks to data locality. In this paper, we implement a replication-aware parallel join operator: Pen. This operator can be used to exploit replicated data during query execution. For existing replication-aware federated query engines, this operator exploits replicated data to parallelize the execution of joins and reduce execution time. For Triple Pattern Fragment (TPF) clients, this operator exploits the availability of several TPF servers exposing the same dataset to share the load among the servers. We implemented Pen in the federated query engine FedX with the replication-aware source selection Fedra and in the reference TPF client. We empirically evaluated the performance of engines extended with the Pen operator and the experimental results suggest that our extensions outperform the existing approaches in terms of execution time and balance of load among the servers, respectively.

Crossref

VBN

Approximate Querying on Property Graphs

Author: A Bonifati
A Khan
D Calvanese
D Hernández
G Aluç
G Bagan
LG Valiant
M Rudolf
PT Wood
R Angles
S Malyshev
Y Wu
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 16/12/2019

Crossref

Hal - Université Grenoble Alpes

INRIA a CCSD electronic archive server

HAL

Hal-Diderot

Bindings-restricted triple pattern fragments

Author: C Bizer
C Buil-Aranda
G Aluç
G Montoya
J Herwegen
J Pérez
J Van Herwegen
JD Fernández
M Acosta
M Sande
M Schmachtenberg
O Hartig
R Verborgh
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2016

The Triple Pattern Fragment (TPF) interface is a recent proposal for reducing server load in Web-based approaches to execute SPARQL queries over public RDF datasets. The price for less overloaded servers is a higher client-side load and a substantial increase in network load (in terms of both the number of HTTP requests and data transfer). In this paper, we propose a slightly extended interface that allows clients to attach intermediate results to triple pattern requests. The response to such a request is expected to contain triples from the underlying dataset that not only match the given triple pattern (as in the case of TPF) but are also guaranteed to contribute to a join with the given intermediate result. Our hypothesis is that a distributed query execution using this extended interface can reduce the network load (in comparison to a pure TPF-based query execution) without significantly reducing the overall throughput of the client-server system. Our main contribution in this paper is twofold: we empirically verify the hypothesis and provide an extensive experimental comparison of our proposal and TPF.

Publikationer från Linköpings universitet

Crossref

Digitala Vetenskapliga Arkivet - Academic Archive On-line

Eunomos, a legal document and knowledge management system for the Web to provide relevant, reliable and up-to-date information on the law

Author: A Boer
C Cortes
D Gabbay
D Makinson
G Aluç
G Boella
G Salton
Guido Boella
J Platt
L Robaldo
Leendert van der Torre
Livio Robaldo
Llio Humphreys
Luigi Di Caro
P Rossi
P Visser
Piercarlo Rossi
R Stamper
T Joachims
T Neumann
W Peters
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Graph Generation and Benchmarks

Author: A Bonifati
A Iosup
A Lancichinetti
C Bizer
D Chakrabarti
G Bagan
Güneş Aluç
J Leskovec
L Backstrom
M Girvan
M McPherson
M Morsey
ME Newman
O Erling
S Boccaletti
TG Armstrong
TG Kolda
Y Guo
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 07/02/2018

Crossref

HAL

Hal-Diderot

SigMR: MapReduce-based SPARQL query processing by signature encoding and multi-way join

Author: BH Bloom
C Weiss
Dong-Hyuk Im
Faye Cure
FN Afrati
G Aluç
Hong-Gee Kim
J Dean
J Huang
J Myung
Jh Um
Jinhyun Ahn
L Zou
M Husain
T Berners-Lee
T Neumann
X Cui
Z Kaoudi
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Building self-clustering RDF databases using Tunable-LSH

Author: A Morrison
AK Jain
AZ Broder
C Weiss
CC Aggarwal
DJ Abadi
F Goasdoué
F Halim
FF-H Nah
G Aluç
Güneş Aluç
JB Kruskal
JD Foley
Khuzaima Daudjee
KR French
L He
L Sidirourgos
L Zeng
L Zou
M. Tamer Özsu
MF Husain
O Erling
P Jaccard
P Yuan
R Al-Harbi
R Harbi
S Ceri
S Idreos
S Lightstone
T Neumann
VI Levenshtein
Y Tao
Publication venue: 'Springer Science and Business Media LLC'

Crossref

RDF Stores for Enhanced Living Environments: An Overview

Author: A Callahan
C Bizer
DC Faye
DJ Abadi
F Chang
G Aluç
I Filali
K Zeng
L Sidirourgos
M Mohsin Saleemi
M Salvadores
MM Saleemi
Mohamed Morsey
MT Özsu
ND Rodríguez
O Erling
R Harbi
T Neumann
V Haarslev
Y Guo
Publication venue: Springer
Publication date: 19/01/2019

Handling large knowledge bases of information from different domains such as the World Wide Web is a complex problem addressed in the Resource Description Framework (RDF) by adding semantic meaning to the data itself. The amount of linked data has brought with it a number of specialized databases that are capable of storing and processing RDF data, called RDF stores. We explore the RDF store landscape with the aim of finding an RDF store that sufficiently meets the storage needs of an enhanced living environment, more concretely the requirements of a Smart Space platform aimed at running on a cluster setup of low-power hardware that can be run locally entirely at home with the purpose of logging data for a reactive assistive system involving, e.g., activity recognition or domotics. We present a literature analysis of RDF stores and identify promising candidates for implementation of consumer Smart Spaces. Based on the insights provided by our study, we conclude by suggesting different relevant aspects of RDF storage systems that need to be considered in Ambient Assisted Living environments and a comparison of available solutions.

Crossref

INRIA a CCSD electronic archive server